Abstract

This paper addresses asymptotic properties of general penalized spline estimators with
an arbitrary B-spline degree and an arbitrary order difference penalty. The estimator is
approximated by a solution of a linear differential equation subject to suitable boundary
conditions. It is shown that, in a certain sense, the penalized smoothing corresponds
approximately to smoothing by the kernel method. The equivalent kernels for both inner
points and boundary points are obtained with the help of Green's functions of the
differential equation. Further, the asymptotic normality is established for the estimator at
interior points. It is shown that the convergence rate is independent of the degree of the
splines, and the number of knots does not affect the asymptotic distribution, provided
that it tends to infinity fast enough.
1 Introduction

Consider the problem of estimating the function $f : [0,1] \to \mathbb{R}$ from a univariate regression
model $y_i = f(t_i) + \epsilon_i$, $i = 1,\dots,n$, where the $t_i$ are pre-specified design points and the
$\epsilon_i$ are iid normal random variables with mean 0 and variance $\sigma^2$. This paper presents a local
asymptotic theory of penalized spline estimators of $f$.

The penalized spline regression model with difference penalty was introduced by Eilers
and Marx (1996), who coined the term "P-splines", but using fewer knots for the regression
problem can be traced back at least to O'Sullivan (1986). Penalized spline smoothing has
become popular over the last decade, and the use of low-rank bases leads to highly tractable
computation. The methodology and applications of P-splines are discussed extensively in 
Ruppert, Wand and Carroll (2003). On the other hand, asymptotic properties of the P- 
spline estimators are less explored in the literature. A few exceptions include recent papers 
such as Hall and Opsomer (2005), Li and Ruppert (2008), and Claeskens, Krivobokova, and 
Opsomer (2009). Hall and Opsomer (2005) placed knots continuously over a design set and 
established consistency of the estimator. Li and Ruppert (2008) developed an asymptotic 
theory of P-splines for piecewise constant and linear B-splines with the first and second order
difference penalties. Claeskens, Krivobokova, and Opsomer (2009) studied bias, variance and
asymptotic rates of the P-spline estimator under different choices of the number of knots and 
penalty parameters. An interested reader may also refer to Pal and Woodroofe (2007), Shen 
and Wang (2009), and Wang and Shen (2009) for shape constrained regression estimators and 
their applications. 

The P-spline model approximates the regression function by $f^{[p]}(x) = \sum_{k=1}^{K_n+p} b_k B_k^{[p]}(x)$,
where $\{B_k^{[p]} : k = 1,\dots,K_n+p\}$ is the $p$th degree B-spline basis with knots
$0 = \kappa_0 < \kappa_1 < \cdots < \kappa_{K_n} = 1$. The value of $K_n$ will depend upon $n$ as discussed below. The spline
coefficients $\hat b = \{\hat b_k,\ k = 1,\dots,K_n+p\}$ subject to the $m$th-order difference penalty are chosen
to minimize

$$\sum_{i=1}^{n}\Big[y_i - \sum_{k=1}^{K_n+p} b_k B_k^{[p]}(t_i)\Big]^2 + \lambda^* \sum_{k=m+1}^{K_n+p} \big[\Delta^m(b_k)\big]^2, \tag{1}$$

where $\lambda^* > 0$ and $\Delta$ is the backward difference operator, i.e., $\Delta b_k = b_k - b_{k-1}$ and

$$\Delta^m b_k = \Delta\Delta^{m-1} b_k = \cdots = \sum_{j=0}^{m} (-1)^{m-j}\binom{m}{j}\, b_{k-m+j}. \tag{2}$$
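As a quick numerical check of (2), the sketch below compares the binomial-sum form of $\Delta^m b_k$ with repeated first differencing. The function names and the test vector are illustrative and are not taken from the paper.

```python
from math import comb


def backward_difference(b, m):
    """Apply the backward difference operator m times by direct iteration."""
    out = list(b)
    for _ in range(m):
        out = [out[k] - out[k - 1] for k in range(1, len(out))]
    return out


def difference_by_formula(b, m):
    """Evaluate Delta^m b_k from the binomial sum in (2) for k = m+1, ..., len(b)."""
    # Python lists are 0-based, so b[k - 1] stores the coefficient b_k.
    return [
        sum((-1) ** (m - j) * comb(m, j) * b[k - m + j - 1] for j in range(m + 1))
        for k in range(m + 1, len(b) + 1)
    ]


# Hypothetical coefficient vector b_k = k^2: its second differences are constant.
coeffs = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]
second_diff = difference_by_formula(coeffs, 2)
```

Both routines return one value per index $k = m+1, \dots$, which matches the range of the penalty sum in (1).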

For simplicity, we assume that both the design points and the knots are equally spaced on the
interval $[0,1]$. We also assume that $n/K_n$ is an integer denoted by $M_n$. Hence every $M_n$th
design point is a knot, that is, $\kappa_j = t_{jM_n}$ for $j = 1,\dots,K_n$; a more general case is discussed
briefly in Section 6. The P-spline estimator is given by $\hat f^{[p]}(x) = \sum_{k=1}^{K_n+p} \hat b_k B_k^{[p]}(x)$.

This paper develops a general asymptotic theory of P-splines under an arbitrary choice 
of p and m. It is shown that the P-spline estimator can be approximated by the solution of 
an ordinary differential equation (ODE) with suitable boundary conditions. This estimator 
is then shown to be described by a kernel estimator, using a Green's function obtained from 
a closely related boundary value problem as a kernel. The asymptotic properties of the 
estimator thus are explicitly established based on the Green's function and the solution of the 
differential equation. It is worth mentioning that asymptotic analysis of smoothing splines 
using Green's functions was performed by Rice and Rosenblatt (1983), Silverman (1984), 
Messer (1991), Nychka (1995) and Pal and Woodroofe (2007). However, these papers only 
treat limited special cases. In contrast, the current paper develops a general framework for 
P-splines. This framework leads to a relatively simple approach to obtaining closed-form
expressions of equivalent kernels for both inner points and boundary points for the first
time. Further, we show that the convergence rate of the estimator depends only on $m$ but not on $p$,
as long as $K_n$ tends to infinity fast enough; see Corollary 4.1, where $K_n$ is of order $n^\gamma$ with
$\gamma > (2m-1)/(4m+1)$.

The contributions of the present paper are twofold: (i) the paper develops a general
approach for asymptotic analysis of a P-spline estimator with an arbitrary spline degree
and arbitrary order difference penalty via Green's functions. To handle a general P-spline
estimator, various techniques for linear ODEs are exploited to obtain a corresponding Green's
function; (ii) closed-form expressions of equivalent kernels for both inner and boundary
points are established and convergence rates are developed for general P-spline estimators.
Compared with the existing results based on matrix techniques, e.g. Li and Ruppert (2008) 
and Claeskens, Krivobokova, and Opsomer (2009), the use of Green's functions considerably 
simplifies the development and yields an instrumental alternative to establish the equivalent 
kernels for general P-splines. Moreover, this also leads to the convergence rates and the
observation that the rates are independent of the splines' degrees and the number of knots
for an arbitrary P-spline estimator. While this observation is pointed out by Li and Ruppert 
(2008) for piecewise constant and piecewise linear splines and is conjectured for general P- 
splines, no rigorous justification has been given for general P-splines in the literature; the 
current paper offers a satisfactory answer to this issue in a general setting. 

The paper is organized as follows. Section 2 characterizes the general P-spline estimator
as an approximate solution of a linear differential equation subject to suitable boundary
conditions. Section 3 investigates the solution of this differential equation and obtains
the related Green's functions as equivalent kernels for a P-spline estimator of an arbitrary
B-spline degree with any order difference penalty. Using these Green's functions, the asymptotic
properties of P-splines are established in Section 4. Section 5 addresses kernel approximation
near the boundary of the design set. By formulating the boundary conditions in an appropriate
integral form, an explicit equivalent kernel is obtained. Finally, extensions to unequally spaced
data and multivariate P-splines are discussed in Section 6.



2 Characterization of the estimator 



Let $X = [B_k^{[p]}(x_i)] \in \mathbb{R}^{n\times(K_n+p)}$ be the design matrix, and let $D_m \in \mathbb{R}^{(K_n+p-m)\times(K_n+p)}$ be the
$m$th-order difference matrix such that $D_m b = [\Delta^m(b_{m+1}),\dots,\Delta^m(b_{K_n+p})]^T$. The optimality
condition is given by

$$(X^TX + \lambda^* D_m^T D_m)\,\hat b = X^T y, \tag{3}$$

where $y = (y_1,\dots,y_n)^T$.
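A minimal sketch of solving the optimality condition (3), assuming the simplest basis $p = 0$ (indicator functions of the knot intervals), so that $X^TX = M_n I$ and $X^Ty$ collects the within-interval sums of $y$. The helper names, the dense Gaussian elimination, and the toy data are assumptions made for illustration; a practical implementation would use a banded solver and a general B-spline basis.

```python
def difference_matrix(size, m):
    """Return D_m as a list of rows, built by differencing the identity m times."""
    rows = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(m):
        rows = [[rows[i][j] - rows[i - 1][j] for j in range(size)]
                for i in range(1, len(rows))]
    return rows


def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def pspline_coefficients(y, num_knots, m, lam):
    """Solve (X^T X + lam * D_m^T D_m) b = X^T y for a degree-0 basis."""
    per_bin = len(y) // num_knots  # M_n = n / K_n, assumed to be an integer
    # For p = 0 each basis function is the indicator of one knot interval, so
    # X^T X = M_n * I and X^T y holds the sums of y within each interval.
    xty = [sum(y[k * per_bin:(k + 1) * per_bin]) for k in range(num_knots)]
    d = difference_matrix(num_knots, m)
    lhs = [[lam * sum(row[i] * row[j] for row in d) + (per_bin if i == j else 0.0)
            for j in range(num_knots)] for i in range(num_knots)]
    return solve(lhs, xty)


# Toy data: 5 knot intervals with 4 observations each, interval means 0, 1, 2, 3, 4.
toy_y = [float(k) for k in range(5) for _ in range(4)]
toy_fit = pspline_coefficients(toy_y, 5, 2, 10.0)
```

Because a linear sequence lies in the null space of $D_2$, the second-order penalty leaves the linear interval means in the toy data unchanged for any $\lambda^*$, whereas a heavy first-order penalty pulls all coefficients toward the overall mean.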

To characterize the P-spline estimator we introduce more notation. Define $C \in \mathbb{R}^{(K_n+p)\times(K_n+p)}$
and $\tilde C \in \mathbb{R}^{(K_n+p)\times n}$, respectively, as

[display defining $C$ and $\tilde C$ not recoverable from the extracted text]

where $\mathbf{1} = [1, 1, \dots, 1]^T \in \mathbb{R}^{M_n\times 1}$. Since $C$ is invertible, for any $k \in \mathbb{N}$, (3) is equivalent to



$$\lambda^* C^k D_m^T D_m \hat b + C^k X^T \hat f = C^k X^T y, \tag{4}$$

where $\hat f = [\hat f^{[p]}(x_1),\dots,\hat f^{[p]}(x_n)]^T$ and $C^k = CC\cdots C$ ($k$ copies). The matrix $D_m^TD_m$ is a banded
symmetric matrix. Except for the first $m$ and last $m$ rows, every row of $D_m^TD_m$ has the
form $(0,\dots,0,\omega_0,\omega_1,\dots,\omega_{2m},0,\dots,0)$, where $\omega_j = (-1)^m(-1)^{2m-j}\binom{2m}{j}$, $j = 0,\dots,2m$.
Moreover, except for the first $m-k$ and last $m$ rows, the $i$th row of $C^kD_m^TD_m$ has the form

$$\big(\underbrace{0,\dots,0}_{(i-m+k-1)\text{ copies}},\ \omega_0^k,\dots,\omega_{2m-k}^k,\ \underbrace{0,\dots,0}_{(K_n+p)-(i+m)\text{ copies}}\big),$$

where

$$\omega_j^k = (-1)^m(-1)^{2m-k-j}\binom{2m-k}{j}, \quad j = 0,\dots,2m-k. \tag{5}$$

Further, the elements of the last $k$ rows of $C^kD_m^TD_m$ are all zeros. In particular, when $k = m$,

$$C^mD_m^TD_m b = (-1)^m\big[\Delta^m b_{m+1},\Delta^m b_{m+2},\dots,\Delta^m b_{K_n+p},0,\dots,0\big]^T. \tag{6}$$
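It is also interesting to note the derivative formula for B-spline functions (de Boor, 2001)

$$\frac{d^l}{dx^l}\sum_{k=1}^{K_n+p} b_k B_k^{[p]}(x) = \sum_{k=l+1}^{K_n+p} K_n^l\,\Delta^l b_k\, B_k^{[p-l]}(x), \quad l \le p. \tag{7}$$

Hence,

$$\frac{d^m}{dx^m}\sum_{k=1}^{K_n+m} \hat b_k B_k^{[m]}(x) = K_n^m\sum_{k=m+1}^{K_n+m} \Delta^m \hat b_k\, B_k^{[0]}(x),$$

and therefore,

$$\Delta^m \hat b_{m+k} = \frac{1}{K_n^m}\frac{d^m}{dx^m}\hat f^{[m]}(x), \quad x \in (\kappa_{k-1},\kappa_k],\ k = 1,\dots,K_n. \tag{8}$$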

Let $\omega_1$ be the uniform distribution on $\{t_1,\dots,t_n\}$ and $\omega_2$ be the uniform distribution
on $\{\kappa_1,\dots,\kappa_{K_n}\}$. Let $g$ and $\hat f$ be two piecewise constant functions for which $g(x_k) = y_k$ and
$\hat f(x_k) = \hat f^{[p]}(x_k)$ for $k = 1,\dots,n$, respectively. Let $G_1(x) = \int_0^x g(t)\,d\omega_1(t)$, $F_1(x) = \int_0^x \hat f(t)\,d\omega_1(t)$,
$\tilde F_1(x) = \int_0^x \hat f(t)\,dt$, and for $k \ge 2$, define

$$G_k(x) = \int_0^x G_{k-1}(t)\,d\omega_2(t), \quad F_k(x) = \int_0^x F_{k-1}(t)\,d\omega_2(t), \quad \tilde F_k(x) = \int_0^x \tilde F_{k-1}(t)\,dt.$$

To obtain the analogous representation for $f$, we introduce a few variables and functions
related to the true regression function $f$. Define $\tilde\Phi_1(x) = \int_0^x f(t)\,dt$, $\Phi_1(x) = \int_0^x f(t)\,d\omega_1(t)$,
and for $k \ge 2$,

$$\tilde\Phi_k(x) = \int_0^x \tilde\Phi_{k-1}(t)\,dt, \quad \Phi_k(x) = \int_0^x \Phi_{k-1}(t)\,d\omega_2(t).$$

Letting $R = CX^T - \tilde C$, we have $C^mX^T\hat f = C^{m-1}\tilde C\hat f + C^{m-1}R\hat f$. Therefore, the $j$th row of
(4), when $k = m$, can be written as

$$F_m(\kappa_{j+p-1}) + R_{f,j} + (-1)^m\frac{\lambda^*}{nK_n^{m-1}}\Delta^m\hat b_{m+j} = G_m(\kappa_{j+p-1}) + R_{y,j}, \quad j = 1,\dots,K_n, \tag{9}$$

where $R_{f,j}$ and $R_{y,j}$ are the $j$th row of $C^{m-1}R\hat f$ and $C^{m-1}Ry$, respectively.

Furthermore, since the elements of the last $k$ rows of $C^kD_m^TD_m$ are all zeros, we also have

$$F_k(1) = G_k(1), \quad k = 1,\dots,m. \tag{10}$$

Next, we proceed by replacing the difference equation (9) by an analogous differential
equation. We shall focus on the case when $p = m$ first; the case when $p \ne m$ will be discussed
in Section 4. For any $x \in [0,1]$, letting $k_x = \lfloor K_nx\rfloor + 1$, (9) gives

$$F_m(\kappa_{k_x+p}) + R_{f,k_x+1} + (-1)^m\frac{\lambda^*}{nK_n^{m-1}}\Delta^m\hat b_{m+k_x+1} = G_m(\kappa_{k_x+p}) + R_{y,k_x+1}. \tag{11}$$

Define

$$R(x) = \tilde F_m(x) - G_m(x) + G_m(\kappa_{k_x+p}) - F_m(\kappa_{k_x+p}) + R_{y,k_x+1} - R_{f,k_x+1}. \tag{12}$$

Then, from (8) and (11), $\tilde F_m$ solves the ordinary differential equation

$$(-1)^m\alpha\tilde F_m^{(2m)}(x) + \tilde F_m(x) = G_m(x) + R(x), \quad 0 < x < 1, \tag{13}$$

where $\alpha = \lambda^*/(nK_n^{2m-1})$. We have $2m$ boundary conditions for (13):

$$\tilde F_m^{(k)}(0) = 0, \quad \tilde F_m^{(k)}(1) = G_{m-k}(1) + e_{m-k}, \quad k = 0,\dots,m-1,$$

where $e_{m-k} = \tilde F_{m-k}(1) - F_{m-k}(1)$. We shall show that $\hat f^{[p]}$ is stochastically bounded; therefore
the $e_k$ are small with an order of $O_p(1/n)$.

3 Green's functions 

The solution to (13) can be represented explicitly by a corresponding Green's function. It
shall be shown that the P-spline estimator can be approximated by a kernel estimator, using
the corresponding Green's function. To this end, consider the differential equation

$$(-1)^m\alpha F^{(2m)}(t) + F(t) = G(t), \quad 0 < t < 1, \tag{14}$$

subject to the boundary conditions $F^{(i)}(0) = 0$ and $F^{(i)}(1) = G^{(i)}(1)$, $i = 0,\dots,m-1$. Let
$\beta = \alpha^{-1/(2m)}$. We consider two cases: (1) $m$ is even; and (2) $m$ is odd.

3.1 Even m 

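In this case, the characteristic equation is given by $\lambda^{2m} + \beta^{2m} = 0$, and we obtain $2m$
eigenvalues

$$\lambda_k = \beta\Big[\cos\frac{(1+2k)\pi}{2m} + i\sin\frac{(1+2k)\pi}{2m}\Big], \quad k = 0,1,\dots,2m-1.$$

Let

$$\mu_k = \cos\frac{(1+2k)\pi}{2m} \quad\text{and}\quad \omega_k = \sin\frac{(1+2k)\pi}{2m}.$$

Then the homogeneous ODE: $\alpha F^{(2m)}(t) + F(t) = 0$ has $2m$ solutions

$$e^{(\pm\mu_k\pm i\omega_k)\beta t} = e^{\pm\beta\mu_kt}\big[\cos(\beta\omega_kt) \pm i\sin(\beta\omega_kt)\big], \quad k = 0,\dots,\tfrac{m}{2}-1,$$

where $\mu_k > 0$ and $\omega_k > 0$ for $k = 0,\dots,\tfrac{m}{2}-1$.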
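The following sketch, with illustrative function names, computes $(\mu_k,\omega_k)$ for even $m$ and evaluates the characteristic polynomial $\lambda^{2m} + \beta^{2m}$ at the decaying exponent $\lambda = \beta(-\mu_k + i\omega_k)$, so that the root property can be checked numerically. The value of $\beta$ is an arbitrary choice for the check.

```python
import math


def stable_exponents(m):
    """Return the pairs (mu_k, omega_k) for k = 0, ..., m/2 - 1 when m is even."""
    if m % 2 != 0:
        raise ValueError("this sketch covers the even-m case only")
    pairs = []
    for k in range(m // 2):
        angle = (1 + 2 * k) * math.pi / (2 * m)
        pairs.append((math.cos(angle), math.sin(angle)))
    return pairs


def characteristic_residual(m, beta, mu, omega):
    """Evaluate lambda^(2m) + beta^(2m) at lambda = beta * (-mu + i * omega)."""
    lam = beta * complex(-mu, omega)
    return lam ** (2 * m) + beta ** (2 * m)


# Hypothetical check for m = 4 and an arbitrary beta.
example_pairs = stable_exponents(4)
example_residuals = [characteristic_residual(4, 1.7, mu, om) for mu, om in example_pairs]
```

For $m = 4$ the first pair is approximately $(0.9239, 0.3827)$, the constants that appear in the $m = 4$ kernel of Example 3.1.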

To find the corresponding Green's function for the ODE: $\alpha F^{(2m)}(t) + F(t) = G(t)$ on
$[0,1]$, we define the following function

$$L(t) = \sum_{k=0}^{\frac{m}{2}-1}\beta e^{-\beta\mu_kt}\big[c_k\cos(\omega_k\beta t) + d_k\sin(\omega_k\beta t)\big], \tag{15}$$

where the coefficients $c_k, d_k$ are to be determined, and $K(t,s) = L(|t-s|)$. Since $L$ is a linear
combination of the solutions of the homogeneous ODE, $L^{(2m)} + \beta^{2m}L = 0$ also holds. Let

$$F_0(t) = \int_0^1 K(t,s)G(s)\,ds, \quad t \in [0,1].$$



By noting $F_0(t) = \int_0^t L(t-s)G(s)\,ds + \int_t^1 L(s-t)G(s)\,ds$ for all $t \in [0,1]$, it is easy to verify
that if

$$L^{(k)}(t)\big|_{t=0} = 0,\ \forall\,k = 1,3,\dots,2m-3, \quad\text{and}\quad L^{(2m-1)}(t)\big|_{t=0} = \frac{\beta^{2m}}{2}, \tag{16}$$

then $F_0(t)$ is a solution of $\alpha F^{(2m)} + F = G$.
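To find the coefficients $c_k, d_k$, define

$$p_k(t) = e^{-\beta\mu_kt}\big[c_k\cos(\omega_k\beta t) + d_k\sin(\omega_k\beta t)\big], \quad q_k(t) = e^{-\beta\mu_kt}\big[-c_k\sin(\omega_k\beta t) + d_k\cos(\omega_k\beta t)\big].$$

Hence $p_k(0) = c_k$ and $q_k(0) = d_k$. Since

$$\begin{pmatrix} p_k'(t)\\ q_k'(t)\end{pmatrix} = \beta A_k\begin{pmatrix} p_k(t)\\ q_k(t)\end{pmatrix}, \quad A_k = \begin{pmatrix} -\mu_k & \omega_k\\ -\omega_k & -\mu_k\end{pmatrix}, \tag{17}$$

we have

$$\begin{pmatrix} p_k^{(j)}(t)\\ q_k^{(j)}(t)\end{pmatrix} = \beta^jA_k^j\begin{pmatrix} p_k(t)\\ q_k(t)\end{pmatrix},$$

where $p_k^{(j)}(t)$ and $q_k^{(j)}(t)$ stand for the $j$-th derivatives of $p_k$ and $q_k$ respectively. Letting
$A_k^j(i,\ell)$ denote the $(i,\ell)$-element of $A_k^j$, we obtain the following linear equation for $\{c_k, d_k\}$
from (16):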



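$$\underbrace{\begin{pmatrix}
A_0(1,1) & A_0(1,2) & \cdots & A_{\frac{m}{2}-1}(1,1) & A_{\frac{m}{2}-1}(1,2)\\
A_0^{3}(1,1) & A_0^{3}(1,2) & \cdots & A_{\frac{m}{2}-1}^{3}(1,1) & A_{\frac{m}{2}-1}^{3}(1,2)\\
\vdots & \vdots & & \vdots & \vdots\\
A_0^{2m-3}(1,1) & A_0^{2m-3}(1,2) & \cdots & A_{\frac{m}{2}-1}^{2m-3}(1,1) & A_{\frac{m}{2}-1}^{2m-3}(1,2)\\
A_0^{2m-1}(1,1) & A_0^{2m-1}(1,2) & \cdots & A_{\frac{m}{2}-1}^{2m-1}(1,1) & A_{\frac{m}{2}-1}^{2m-1}(1,2)
\end{pmatrix}}_{A^e}
\begin{pmatrix} c_0\\ d_0\\ \vdots\\ c_{\frac{m}{2}-1}\\ d_{\frac{m}{2}-1}\end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ \vdots\\ 0\\ \tfrac{1}{2}\end{pmatrix}. \tag{18}$$

It shall be shown in Lemma 3.1 that the above equation has a unique solution.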
3.2 Odd m 

The characteristic equation is given by $\lambda^{2m} - \beta^{2m} = 0$ and the eigenvalues are:

$$\lambda_k = \beta\Big[\cos\frac{k\pi}{m} + i\sin\frac{k\pi}{m}\Big], \quad k = 0,1,\dots,2m-1.$$

Then the homogeneous ODE: $\alpha F^{(2m)}(t) - F(t) = 0$ has $2m$ solutions: $e^{\pm\beta t}$ and

$$e^{(\pm\mu_k\pm i\omega_k)\beta t} = e^{\pm\beta\mu_kt}\big[\cos(\beta\omega_kt) \pm i\sin(\beta\omega_kt)\big], \quad k = 1,\dots,\tfrac{m-1}{2},$$

where $\mu_k = \cos\frac{k\pi}{m} > 0$ and $\omega_k = \sin\frac{k\pi}{m} > 0$ for $k = 1,\dots,\tfrac{m-1}{2}$. Similar to the even case,
define

$$P(t) = c_0\beta e^{-\beta t} + \sum_{k=1}^{(m-1)/2}\beta e^{-\beta\mu_kt}\big[c_k\cos(\omega_k\beta t) + d_k\sin(\omega_k\beta t)\big], \tag{19}$$

where the coefficients $c_k, d_k$ are to be determined, and $P(t)$ satisfies $P^{(2m)}(t) - \beta^{2m}P(t) = 0$.
Let $K(t,s) = P(|t-s|)$ and $F_0(t) = \int_0^1 K(t,s)G(s)\,ds$. It can be verified that if

$$P^{(k)}(t)\big|_{t=0} = 0,\ \forall\,k = 1,3,\dots,2m-3, \quad\text{and}\quad P^{(2m-1)}(t)\big|_{t=0} = -\frac{\beta^{2m}}{2}, \tag{20}$$

then $F_0(t)$ is a solution of $\alpha F^{(2m)} - F = -G$. Similarly, it can be shown that $P$ is also a
$2m$th-order kernel. To find the coefficients $c_0$ and $c_k, d_k$, we may use $p_k$, $q_k$ and $A_k$ introduced
in the last subsection. Indeed, we obtain the following linear equation for $c_0$ and $\{c_k, d_k\}$ from
(20):



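$$\underbrace{\begin{pmatrix}
-1 & A_1(1,1) & A_1(1,2) & \cdots & A_{\frac{m-1}{2}}(1,1) & A_{\frac{m-1}{2}}(1,2)\\
-1 & A_1^{3}(1,1) & A_1^{3}(1,2) & \cdots & A_{\frac{m-1}{2}}^{3}(1,1) & A_{\frac{m-1}{2}}^{3}(1,2)\\
\vdots & \vdots & \vdots & & \vdots & \vdots\\
-1 & A_1^{2m-3}(1,1) & A_1^{2m-3}(1,2) & \cdots & A_{\frac{m-1}{2}}^{2m-3}(1,1) & A_{\frac{m-1}{2}}^{2m-3}(1,2)\\
-1 & A_1^{2m-1}(1,1) & A_1^{2m-1}(1,2) & \cdots & A_{\frac{m-1}{2}}^{2m-1}(1,1) & A_{\frac{m-1}{2}}^{2m-1}(1,2)
\end{pmatrix}}_{A^o}
\begin{pmatrix} c_0\\ c_1\\ d_1\\ \vdots\\ c_{\frac{m-1}{2}}\\ d_{\frac{m-1}{2}}\end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ \vdots\\ 0\\ -\tfrac{1}{2}\end{pmatrix}. \tag{21}$$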



3.3 The equivalent kernels 

Lemma 3.1. Each of the equations (18) and (21) has a unique solution.

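Proof. We introduce some trigonometric identities to be used in the proof. Let $p, q \in \mathbb{N}$.
By observing $\sin(-\theta)\sum_{k=1}^{q}\cos[(2k-1)\theta] = \frac{1}{2}\sum_{k=1}^{q}\big[\sin(2(k-1)\theta) + \sin(-2k\theta)\big]$ and
$\sin(\theta)\sum_{k=1}^{q}\sin[(2k-1)\theta] = \frac{1}{2}\sum_{k=1}^{q}\big[\cos(2(k-1)\theta) - \cos(2k\theta)\big]$, both of which telescope, it is easy to see that,
provided $\sin\theta \ne 0$, (i) for $\theta = \frac{p}{2q}\pi$, $\sum_{k=1}^{q}\cos[(2k-1)\theta] = 0$; and (ii) for $\theta = \frac{p}{q}\pi$, $\sum_{k=1}^{q}\sin[(2k-1)\theta] = 0$.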

We consider an even $m$ first. Let $\theta = \pi - \frac{\pi}{2m}$. It is clear that $-\mu_k = \cos((2k+1)\theta)$ and
$\omega_k = \sin((2k+1)\theta)$ for all $k = 0,\dots,m/2-1$. Hence $A_k$ in (17) becomes $A_k = M((2k+1)\theta)$,
where $M(\cdot) \in SO(2)$ is given by

$$M(\cdot) = \begin{pmatrix}\cos(\cdot) & \sin(\cdot)\\ -\sin(\cdot) & \cos(\cdot)\end{pmatrix}. \tag{22}$$

Thus $(A_k)^j = M(j(2k+1)\theta)$. Let $A^e_{i\bullet}$ denote the $i$th row of $A^e$ and $\eta_i = (2i-1)\theta$. Hence,

$$A^e_{i\bullet} = \big(\cos(\eta_i)\ \ \sin(\eta_i)\ \ \cos(3\eta_i)\ \ \sin(3\eta_i)\ \ \cdots\ \ \cos((m-1)\eta_i)\ \ \sin((m-1)\eta_i)\big).$$

Therefore, $A^e_{i\bullet}(A^e_{i\bullet})^T = \frac{m}{2}$, and if $i \ne j$, then

$$\begin{aligned}
A^e_{i\bullet}(A^e_{j\bullet})^T &= \sum_{\ell=1}^{m/2}\big[\cos((2\ell-1)(2i-1)\theta)\cos((2\ell-1)(2j-1)\theta) + \sin((2\ell-1)(2i-1)\theta)\sin((2\ell-1)(2j-1)\theta)\big]\\
&= \sum_{\ell=1}^{m/2}\cos\big(2(2\ell-1)(i-j)\theta\big) = \sum_{\ell=1}^{m/2}\cos\Big((2\ell-1)(i-j)\frac{\pi}{m}\Big) = 0,
\end{aligned}$$

where the last step is attained from (i). This shows that $A^e(A^e)^T = \frac{m}{2}I$. Thus $A^e$ is invertible
so that equation (18) has a unique solution.

We then consider an odd $m$. In this case, $-\mu_k = \cos(\pi - \frac{k\pi}{m})$ and $\omega_k = \sin(\pi - \frac{k\pi}{m})$ for
$k = 1,\dots,(m-1)/2$. Let $\gamma_k = \pi - \frac{k\pi}{m}$. Then the $i$th row of $A^o$ is given by

$$A^o_{i\bullet} = \big(\cos((2i-1)\pi)\ \ \cos((2i-1)\gamma_1)\ \ \sin((2i-1)\gamma_1)\ \ \cdots\ \ \cos((2i-1)\gamma_{\frac{m-1}{2}})\ \ \sin((2i-1)\gamma_{\frac{m-1}{2}})\big).$$

Let $A^o_{\bullet i}$ denote the $i$th column of $A^o$. Clearly $(A^o_{\bullet i})^TA^o_{\bullet i} > 0$. For $i \ne j$, either
$(A^o_{\bullet i})^TA^o_{\bullet j} = \sum_{k=1}^{m}\cos((2k-1)\gamma_s)\cos((2k-1)\gamma_t)$ with $s \ne t$ or
$(A^o_{\bullet i})^TA^o_{\bullet j} = \sum_{k=1}^{m}\cos((2k-1)\gamma_s)\sin((2k-1)\gamma_t)$, for some $s, t \in \{1,\dots,\frac{m-1}{2}\}$. Since

$$\sum_{k=1}^{m}\cos((2k-1)\gamma_s)\cos((2k-1)\gamma_t) = \frac{1}{2}\sum_{k=1}^{m}\big[\cos((2k-1)(\gamma_s+\gamma_t)) + \cos((2k-1)(\gamma_s-\gamma_t))\big],$$

$$\sum_{k=1}^{m}\cos((2k-1)\gamma_s)\sin((2k-1)\gamma_t) = \frac{1}{2}\sum_{k=1}^{m}\big[\sin((2k-1)(\gamma_s+\gamma_t)) - \sin((2k-1)(\gamma_s-\gamma_t))\big],$$

we conclude that $(A^o_{\bullet i})^TA^o_{\bullet j} = 0$ by using (i)-(ii) established at the beginning of the proof.
This shows that $(A^o)^TA^o$ is a diagonal matrix with positive diagonal entries. Therefore $A^o$ is
invertible and equation (21) has a unique solution. □

The following proposition shows that $L$ and $P$ derived above yield the equivalent kernels.

Proposition 3.1. When $\beta = 1$, $L(|t|)$ in (15) and $P(|t|)$ in (19) are $2m$th order kernels
respectively.

Proof. We consider only $L$ since the other case follows from a similar argument. We
shall show that $\int_{-\infty}^{\infty}L(|\tau|)\,d\tau = 1$ and $\int_{-\infty}^{\infty}\tau^kL(|\tau|)\,d\tau = 0$ for all $k = 1,\dots,2m-1$. This
holds true trivially when $k$ is odd. For an even $k$, by observing $L^{(2m)} + \beta^{2m}L = 0$ (with
$\beta = 1$), we have

$$\int_{-\infty}^{\infty}\tau^kL(|\tau|)\,d\tau = 2\int_0^{\infty}\tau^kL(\tau)\,d\tau = -2\int_0^{\infty}\tau^kL^{(2m)}(\tau)\,d\tau.$$

Repeatedly using integration by parts, we deduce

$$\int_0^{\infty}\tau^kL^{(2m)}(\tau)\,d\tau = \sum_{i=0}^{k}(-1)^i\frac{k!}{(k-i)!}\,\tau^{k-i}L^{(2m-1-i)}(\tau)\Big|_0^{\infty}.$$

In light of (16), we obtain the desired result. □
Figure 1: The equivalent kernels for m = 1,2,3,4: (i) m = 1: the dashed line; (ii) m = 2:
the dotted line; (iii) m = 3: the dashdot line; (iv) m = 4: the solid line.



Example 3.1. As an illustration, the closed-form expressions of the first four equivalent
kernels are given below and their plots are shown in Figure 1.

$m = 1$: $K(t) = \frac{1}{2}e^{-|t|}$

$m = 2$: $K(t) = \frac{1}{2\sqrt{2}}\,e^{-|t|/\sqrt{2}}\Big(\cos\frac{|t|}{\sqrt{2}} + \sin\frac{|t|}{\sqrt{2}}\Big)$

$m = 3$: $K(t) = \frac{1}{6}e^{-|t|} + e^{-|t|/2}\Big(\frac{1}{6}\cos\frac{\sqrt{3}|t|}{2} + \frac{\sqrt{3}}{6}\sin\frac{\sqrt{3}|t|}{2}\Big)$

$m = 4$: $K(t) = e^{-0.9239|t|}\big[0.2310\cos(0.3827|t|) + 0.0957\sin(0.3827|t|)\big] + e^{-0.3827|t|}\big[0.0957\cos(0.9239|t|) + 0.2310\sin(0.9239|t|)\big]$
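These kernels can be checked numerically. The sketch below assumes the closed forms for $m = 1, 2, 3$ read $\frac12 e^{-|t|}$, $\frac{1}{2\sqrt2}e^{-|t|/\sqrt2}(\cos\frac{|t|}{\sqrt2} + \sin\frac{|t|}{\sqrt2})$ and $\frac16 e^{-|t|} + \frac16 e^{-|t|/2}(\cos\frac{\sqrt3|t|}{2} + \sqrt3\sin\frac{\sqrt3|t|}{2})$, and compares them with a root-sum expression, $K(t) = -\frac{1}{2m}\sum_{\mathrm{Re}\,r<0} r\,e^{r|t|}$ over the roots of $(-1)^mr^{2m} + 1 = 0$. That expression comes from a residue computation for the Fourier transform $1/(1+\omega^{2m})$ and is not stated in the paper. The moment routine uses a plain trapezoidal rule; all function names are hypothetical.

```python
import cmath
import math


def equivalent_kernel(t, m):
    """Equivalent kernel for beta = 1, written as a sum over the decaying roots."""
    # Roots r of (-1)^m r^(2m) + 1 = 0; only those with negative real part decay.
    offset = 1 if m % 2 == 0 else 0
    roots = [cmath.exp(1j * math.pi * (2 * k + offset) / (2 * m)) for k in range(2 * m)]
    decaying = [r for r in roots if r.real < -1e-12]
    return (-sum(r * cmath.exp(r * abs(t)) for r in decaying) / (2 * m)).real


def closed_form_kernel(t, m):
    """Closed forms assumed for m = 1, 2, 3 (see the lead-in paragraph)."""
    u = abs(t)
    if m == 1:
        return 0.5 * math.exp(-u)
    if m == 2:
        s = u / math.sqrt(2.0)
        return math.exp(-s) * (math.cos(s) + math.sin(s)) / (2.0 * math.sqrt(2.0))
    if m == 3:
        s = math.sqrt(3.0) * u / 2.0
        return math.exp(-u) / 6.0 + math.exp(-u / 2.0) * (
            math.cos(s) + math.sqrt(3.0) * math.sin(s)) / 6.0
    raise ValueError("closed forms are listed here for m = 1, 2, 3 only")


def even_moment(kernel, m, order, upper=60.0, steps=12000):
    """Trapezoidal approximation of the integral of t^order * K(t) for even order."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * t ** order * kernel(t, m)
    # The kernel is even, so an even-order moment is twice the half-line integral.
    return 2.0 * h * total
```

Under these assumptions, the zeroth moment should be close to 1 for every $m$, and the second moment should vanish for $m \ge 2$, in line with Proposition 3.1.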



3.4 Boundary conditions 

Recall that the boundary conditions for the ODE (14) are $F^{(i)}(0) = 0$, $F^{(i)}(1) = G^{(i)}(1)$,
$i = 0,\dots,m-1$. In the following, we consider an even $m$ first. In this case, the homogeneous
ODE: $F^{(2m)} + \beta^{2m}F = 0$ has the following $2m$ (linearly independent) solutions:

$$e^{-\beta\mu_kt}\cos(\beta\omega_kt),\quad e^{-\beta\mu_kt}\sin(\beta\omega_kt),\quad e^{-\beta\mu_k(1-t)}\cos(\beta\omega_kt),\quad e^{-\beta\mu_k(1-t)}\sin(\beta\omega_kt),$$

where $k = 0,\dots,\frac{m}{2}-1$ and $\mu_k, \omega_k > 0$ for the above $k$. The solution to ODE (14) subject to
the boundary conditions can be written as



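$$F(t) = \underbrace{\int_0^1L(|t-s|)G(s)\,ds}_{F_0(t)} + J(t), \tag{23}$$

where

$$J(t) = \sum_{k=0}^{\frac{m}{2}-1}\Big\{e^{-\beta\mu_kt}\big[a_k\cos(\beta\omega_kt) + b_k\sin(\beta\omega_kt)\big] + e^{-\beta\mu_k(1-t)}\big[a_k^+\cos(\beta\omega_kt) + b_k^+\sin(\beta\omega_kt)\big]\Big\}, \tag{24}$$

and the coefficients $a_k, b_k, a_k^+, b_k^+$ are to be determined from the boundary conditions, and the
kernel $L$ is given in (15). Define $\|G\| = \sup_{t\in[0,1]}|G(t)|$. Let $\mathbf{G} = (\|G\|, G(1), G'(1),\dots,G^{(m-1)}(1))$,
and

$$\mathbf{a} = \big(a_0, b_0,\dots,a_{\frac{m}{2}-1}, b_{\frac{m}{2}-1}, a_0^+, b_0^+,\dots,a^+_{\frac{m}{2}-1}, b^+_{\frac{m}{2}-1}\big)^T \tag{25}$$

be the coefficient vector.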

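By making use of the boundary conditions, we obtain the linear equation $B^e\mathbf{a} = v$, where
$v^T = [v_0, v_1]$,

$$v_0 = -\Big(F_0(0),\ \frac{F_0'(0)}{\beta},\ \dots,\ \frac{F_0^{(m-1)}(0)}{\beta^{m-1}}\Big), \quad v_1 = \Big(-F_0(1) + G(1),\ \frac{-F_0'(1) + G'(1)}{\beta},\ \dots,\ \frac{-F_0^{(m-1)}(1) + G^{(m-1)}(1)}{\beta^{m-1}}\Big),$$

and

$$B^e = \begin{pmatrix} B^e_{11} & B^e_{12}\\ B^e_{21} & B^e_{22}\end{pmatrix}.$$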

Here the matrix blocks Bfj £ j^ m >< m are obtained via the similar technique in Section [3.11 as 



10 10 

cos(?7i) sin(??i) cos(3?yi) sin(3?7i) 

_cos(?7 TO _i) sin(ry m _i) cos(3?7 m _i) sin(377 m _i) 



where rjk = k(ir — nL), k = 1, • • • , m — 1, and 



1 

cos((m — 1)771) sin((m — 1)771 ) 

cos((m - l)?7 m _i) sin((m - l)?7 ro _i)_ 



(26) 



^22 



cos(^ ,o) sin(^Q,o) cos^i.o) sin(^i )0 ) 
cos(^o,i) sin^o.i) cos(V>i,i) sin(<0i,i) 

COs(?/>0,m-l) Sin^o.m-l) COS(^l,m-l) W0.(lpl, m -l) 



cos(^ffi-i,o) sin(V>m-i,o) 
cos(^m_i,i) sin(i/i^_i i i) 

cos(V)m-i, m -i) sin(^im_i )m _i) 
where rj£ = for k = 0, • • • , m - 1, and ip jjt = f3uj j + (2j + 1)77^ for all j = 0, • • • , 
0, • • • , m — 1, and each entry of B\ 2 and 5| x is of order 0(e 



Lemma 3.2. Given an even $m$, there exist positive real numbers $\beta^*$ and $\varrho$, dependent on $m$
only, such that for all $\beta > \beta^*$, the coefficient vector $\mathbf{a}$ is unique and satisfies $\|\mathbf{a}\| \le \varrho\|\mathbf{G}\|$.

Proof. Note that for (3 sufficiently large, each element of B\ 2 and B 21 is sufficiently small. 
Hence it suffices to show that B\ x and B 22 are invertible. For this end, let B^{k) denote 
the fcth column of Bf v Define C n = [B{ x {2) Bf^l) Bfj(4) 5^(3) • • • Bf^m) B\ x {m - 1)] . 
Letting d = it can be verified that 

#n + iC u = 

1 2 1 I 1 I 

_£-«> _ le "? -e"* 3 ' 5 -ze' 3 '' _ e - 8 (m-l)j9 _ le »(m-l)'5 

^ g — f&^m — 1 ^/ ^ii?^m — 1 / g — i3i9^m — 1 ^ ^z3i9^m — 1 / ^ — lira — 1)$ \ m — 1 ^/ gi(m — ^m — 1 

Therefore -Bf x + %C\\ can be written as diag(l, i, ■ ■ ■ ,l,i)V, where V is an invertible 
Vandermonde matrix. This implies that B xl + %C\\ is invertible. On the other hand, by 

noting C\\ = B X1 J, where J = diag (J*, • • • , J*) with J* 



1 



, Bl 1 +iC ll = B{ x {I + iJ). 



copies 

It is easily seen that / + i J is invertible, so is B\\. To show the invertibility of B 22 , it is 
noticed that B 22 = B 22 R, where B 22 is similar to B X1 defined in (|26p with 77$ replaced by 
77+ and # = diag(M(/?uo), Af(/3o;i), • • • , M(/3o>™.)) where M(-) is given in ([22]). Clearly i? is 

invertible for all (3, and it can be proved in the similar way as for Bf± that B 22 is nonsingular. 
Hence, B e 22 is invertible for all f3. Consequently det(5 e ) = det(Sf 1 ) det(B| 2 ) + 0(e" m/3 ) / 
for all (3 sufficiently large. In addition, since each entry of the adjoint of B e is bounded, we 
deduce that (B 6 )^ 1 = l^rg^y is bounded and the upper bound depends on m only, where 
ad}(B e ) stands for the adjoint of B e . Furthermore, letting n = maxfc(|cfc|, \d k \), where c k ,d k 
are the coefficients in the kernel L, and g = mm{fi k , k = 0, • • • , y — 1}, we have, for t* = 
or 1, 

\F^(U)\ 

ft 

As a result, the equation B e x = v has a unique solution a that satisfies the desired bound. □ 



< 2mn [ /3e" PQT dr\\G\\ <2mK/g\\G\\, V j = 1, • • • , m - 1. 
•A'i 



Consider an odd $m$. The homogeneous ODE: $F^{(2m)} - \beta^{2m}F = 0$ has the following $2m$
(linearly independent) solutions:

$$e^{\pm\beta t},\quad e^{-\beta\mu_kt}\cos(\beta\omega_kt),\quad e^{-\beta\mu_kt}\sin(\beta\omega_kt),\quad e^{-\beta\mu_k(1-t)}\cos(\beta\omega_kt),\quad e^{-\beta\mu_k(1-t)}\sin(\beta\omega_kt),$$

where $k = 1,\dots,\frac{m-1}{2}$ and $\mu_k, \omega_k > 0$ for the above $k$. The solution to ODE (14) subject to
the boundary conditions can be written as

$$F(t) = \underbrace{\int_0^1P(|t-s|)G(s)\,ds}_{F_0(t)} + J(t), \tag{27}$$

where

$$J(t) = a_0e^{-\beta t} + a_0^+e^{-\beta(1-t)} + \sum_{k=1}^{\frac{m-1}{2}}\Big\{e^{-\beta\mu_kt}\big[a_k\cos(\beta\omega_kt) + b_k\sin(\beta\omega_kt)\big] + e^{-\beta\mu_k(1-t)}\big[a_k^+\cos(\beta\omega_kt) + b_k^+\sin(\beta\omega_kt)\big]\Big\}, \tag{28}$$

and the coefficients $a_k, b_k, a_k^+, b_k^+$ are to be determined from the boundary conditions, and the
kernel $P$ is given in (19). Let

$$\mathbf{b} = \big(a_0, a_1, b_1,\dots,a_{\frac{m-1}{2}}, b_{\frac{m-1}{2}}, a_0^+, a_1^+, b_1^+,\dots,a^+_{\frac{m-1}{2}}, b^+_{\frac{m-1}{2}}\big)^T \tag{29}$$

be the coefficient vector. Similar to the case where $m$ is even, we obtain the linear equation
$B^o\mathbf{b} = v$, where $v^T = [v_0, v_1]$ and



B° 



no r?o 
-°21 D 22 



Here the matrix blocks Bf- £ 



are obtained via the similar technique in Section 13.21 as 



B? 



ii 



1 

-1 



1 

cos (71) 





sin (71) 



1 







(-1) 



m— 1 



cos((m — 1)71) sin((™ — 1)71) 



cos(t ™-i ) sin(7 m-i ) 

cos((m — l)7 m-i ) sin((m — 1)7^-1 ) 



(30) 



where 7fc = (vr-^),A: = 1,2,- •• 



TO— 1 

' 2 



, and 



5 22 



cos(Ci,o) 
cos(Cii) 



sin(Ci,o) 
sin(Ci,i) 



COs(C m-l n ) 
2 ' 

COs(C m-l 1 ) 



sin(C m-i n ) 

2 > u 

sin(C m-i x ) 



1 cos(Ci,to-i) sin(Ci, m -l) 



cos(C 



where 7+ = f^, Cm = + ^ + for all k = 1,2, 



m— 1 
2 

m— 1 



to — 1 ^ 



sin(Ci 



jm—\i 



0, 1, • • • , m — 1, and each 



! 2 ' ' 

entry of -B° 2 and is of order 0(e~P). To show the invertibility of B°±, we introduce 
En = [0 B?i(3) Bft(2) 5^(5) Bfi(4) • • • Sfi(m) Sf^m- 1)] , where Bft(lfe) denotes the fcth 
column of 5^. As before it can be shown that B°± + iEn is nonsingular and + iE\\ = 
B°^(I + iK), where K = diag(0, J*, • • • , J* ) and J* is the 2x2 matrix defined before. Since 

-copies 

/ + is nonsingular, so is B\ v Furthermore, by applying the similar technique, we can 
show that -B22 is invertible for all 0. This thus implies that for all (3 sufficiently large, B° is 
invertible and each entry of (i?°) _1 is bounded by a positive number depending on m only. 
We summarize the above discussions as follows: 

Lemma 3.3. Given an odd $m$, there exist positive real numbers $\beta^*$ and $\varrho$, dependent on $m$
only, such that for all $\beta > \beta^*$, the coefficient vector $\mathbf{b}$ is unique and satisfies $\|\mathbf{b}\| \le \varrho\|\mathbf{G}\|$.
4 Asymptotic properties of P-splines

To establish the asymptotic properties of the estimator, we first represent $\tilde F_m$ as the sum of
the convolutions of $K(t,s)$ (defined in Section 3) with $G_m$ and a remainder term that is of
smaller order.



Lemma 4.1. The $\tilde F_m$ in (13) can be represented as

$$\tilde F_m(t) = \int_0^1K(s,t)G_m(s)\,ds + \int_0^1K(s,t)R(s)\,ds + J(t), \tag{31}$$

where $J(t)$ is given by (24) and (28) for even $m$ and odd $m$, respectively. The $\|\cdot\|_\infty$-norms
of both coefficient vectors $\mathbf{a}$ in (25) and $\mathbf{b}$ in (29) are stochastically bounded, and
$\|R\| = O_p\big((\log K_n/(nK_n))^{1/2}\big)$.
Proof. The representation of F m in (|3ip follows from the discussions in Section[3l The stochas- 
tic boundedness of the coefficient vectors is the direct applications of Lemma [3. 21 and Lemma 
[331 Let y = ^f L X T y and A = X*K n /n. Claeskens et al. (2009) showed that ||-H" _1 ||oo = 0(1), 
where H = ^-X T X + A-D^D m . Thus, b is stochastically bounded, so is />L Let b solve 
(X T X + X*Dj n D m )b = X T f and denote f(x) = £f=i +P b k B^\x). We have 



||/ b] -/|| < ||S-6||oo < Halloo || V-m Hoc = Op(/^V21og2TnV ( 32 ) 



It is shown that || / — / || = 0(a) if p = m. The development of this result is a special case of 
Theorem 14.11 in Section HJ Thus, 

\Fx(x) - Gx(x) + Gx(K kx+p ) - Fi{K-k x+P )\ 
< \A(x) - Fi(x)| + \(Gt(K kx+p ) - Gt(x)) - ($i(K kx+p ) - $x(x))| 

+ |($ 1 (^ +p ) - $i(x)) - (F!(K kx+p ) - F!(x))\ + \(Fi(K kx+p ) - F 1 (x)) - (A(«fc»-Hp) " A( 

2 u£u , ^ ( 1 \ , P M n uI ... , pM n . 



< + 0,(1) + ^n/ - /n + 2^||/ - /I 

n \\nK n J n n 



A similar rate can be obtained for $|R_{y,k_x+1} - R_{f,k_x+1}|$. Given the admissible ranges of $K_n$
and $\alpha$ in Corollary 4.2 below, $O_p\big((\log K_n/(nK_n))^{1/2}\big)$ is the dominating term. Hence, the lemma
follows. □

Theorem 4.1. If the true regression function is $2m$th order continuously differentiable with
bounded $2m$th derivative, then the P-spline estimator $\hat f^{[m]}$ can be written as

$$\hat f^{[m]}(t) = f(t) + (-1)^{m-1}\alpha f^{(2m)}(t) + o(\alpha) + \frac{1}{n}\sum_{i=1}^{n}K(t,t_i)\epsilon_i \tag{33}$$

uniformly in $\alpha$ and in $t \in (0,1)$.
Proof. Taking the $m$th derivative of $\int_0^1K(s,t)G_m(s)\,ds$, we obtain



1 d m K{t,s) 
dt m 



G m (s)ds 



1 d m K{t,s) 



+ 



dt m 
1 d m K(t,s 



$ m (s)ds + 



1 d m K(t,s) i 



G m (s) - $ TO (s) 



ds 



o 



ds. 



It is easy to show that 



jf ^Q^^ Mds = -J dK ^ S h m (s)ds = -S> m (l)K(t, 1) + £ K(t, s)$ m ^( S )ds. 



dt 

Therefore 



f 1 d m K{t,s) 
Jo ~ dt n 



m— 1 



$ m ( s ) ds = _^$ m _.(l)__^l) + ^ K(t,s)f(s)ds 



j=0 



By Equation (6.4) in Theorem 2.2 of Nychka (1995), we have

$$\int_0^1K(t,s)f(s)\,ds = f(t) + (-1)^{m-1}\alpha f^{(2m)}(t) + o(\alpha).$$

Similarly, 

-1 d m K(t,s) 



m—l 







At* 



G m (s) - $ TO (s) 



j=0 

l 



+ / K(i,s) dGi(s) -d$i(s) 



Moreover, 



1 d m K(t,s) 
dt m 



ds 



< ll$ m - $ m . 



1 d m K(t,s] 
dt m 



ds 



which is of order $O(1/n)\beta^m$. Finally, in light of Lemma 4.1, $\big\|\frac{d^m}{dt^m}\int_0^1K(s,t)R(s)\,ds\big\|$ is of
order $(\log K_n/(nK_n))^{1/2}\beta^m$. It is easy to verify that the $m$th derivative of $J(t)$ is of order
$e^{-\beta t(1-t)}\beta^m$. This completes the detail of the representation. □

Remark 4.1. Theorem 4.1 indicates that the P-spline estimator is approximately a kernel
regression estimator. The equivalent kernel is $K(t,s)$ given in Section 3 and $\alpha$ plays a role
similar to the bandwidth $h$. The asymptotic mean has the bias $(-1)^{m-1}\alpha f^{(2m)}(x)$, which is
negligible if $\alpha$ is reasonably small. On the other hand, $\alpha$ cannot be arbitrarily small, as
that will inflate the random component. The admissible range for $\alpha$ given in Corollary 4.1 is
a compromise between these two.
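To illustrate the kernel-regression reading of Theorem 4.1, the sketch below applies the $m = 2$ equivalent kernel, rescaled as $K(t,s) = \beta K_1(\beta|t-s|)$ with $\beta = \alpha^{-1/(2m)}$, to noise-free responses $f(t_i) = t_i^4$ at an interior point and compares the result with $f(t) + (-1)^{m-1}\alpha f^{(2m)}(t)$. The value of $\alpha$, the sample size, the test function, and the function names are all assumptions chosen for illustration; the closed form used for $K_1$ is the standard fourth-order kernel $\frac{1}{2\sqrt2}e^{-|u|/\sqrt2}(\cos\frac{|u|}{\sqrt2} + \sin\frac{|u|}{\sqrt2})$.

```python
import math


def kernel_m2(u):
    """Equivalent kernel for m = 2 and beta = 1."""
    s = abs(u) / math.sqrt(2.0)
    return math.exp(-s) * (math.cos(s) + math.sin(s)) / (2.0 * math.sqrt(2.0))


def kernel_smooth(t, design, values, alpha):
    """Evaluate (1/n) * sum_i K(t, t_i) * values_i with K(t, s) = beta * K_1(beta * |t - s|)."""
    beta = alpha ** (-0.25)  # beta = alpha^(-1/(2m)) with m = 2
    n = len(design)
    return sum(beta * kernel_m2(beta * (t - ti)) * vi
               for ti, vi in zip(design, values)) / n


n_points = 2000
alpha = 1e-8  # hypothetical smoothing level
design = [(i + 1) / n_points for i in range(n_points)]  # equally spaced t_i = i/n
signal = [ti ** 4 for ti in design]  # noise-free responses f(t_i) = t_i^4
fitted = kernel_smooth(0.5, design, signal, alpha)
predicted = 0.5 ** 4 - alpha * 24.0  # f(t) + (-1)^(m-1) * alpha * f''''(t) for m = 2
```

With no noise, the only discrepancy between `fitted` and $f(0.5)$ is the smoothing bias, so this isolates the $(-1)^{m-1}\alpha f^{(2m)}(t)$ term; adding simulated errors $\epsilon_i$ to `signal` would bring in the random component of (33).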
Corollary 4.1. Let $\alpha$ satisfy $\alpha n^{2m/(4m+1)} \to 0$ and $\alpha^{-(2m-1)/(2m)}\log K_n/K_n \to 0$. Suppose
also that the true regression function $f$ is $2m$th order continuously differentiable with bounded
$2m$th derivative. Then for $t \in (0,1)$,

$$\sqrt{n/\beta}\,\big[\hat f^{[m]}(t) - f(t)\big] \to_d N\big(0, \sigma_K^2(t)\big), \tag{34}$$

where $\frac{\sigma^2}{\beta}\int_0^1K^2(t,s)\,ds \to \sigma_K^2(t)$ as $n \to \infty$. However, if $\alpha = c^{2m}n^{-2m/(4m+1)}$ for $c > 0$, and
$K_n \sim n^\gamma$ with $\gamma > (2m-1)/(4m+1)$, then

$$n^{2m/(4m+1)}\big[\hat f^{[m]}(t) - f(t)\big] \to_d N\big((-1)^{m-1}c^{2m}f^{(2m)}(t),\ c^{-1}\sigma_K^2(t)\big). \tag{35}$$

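Proof. Let $\Pi(t) = \frac{1}{n}\sum_{i=1}^{n}K(t,t_i)\epsilon_i$. For any fixed $t$, the Lindeberg-Levy central limit theorem
gives

$$\sqrt{n/\beta}\,\Pi(t) \to N\big(0, \sigma_K^2(t)\big)$$

in distribution, where $\frac{\sigma^2}{\beta}\int_0^1K^2(t,s)\,ds \to \sigma_K^2(t)$ as $n \to \infty$. If $\alpha$ satisfies $\alpha n^{2m/(4m+1)} \to 0$
and $\alpha^{-(2m-1)/(2m)}\log K_n/K_n \to 0$, it is easy to see that the remainder terms in (33) are
$o_p(1)$ after scaling by $\sqrt{n/\beta}$. If $\alpha = c^{2m}n^{-2m/(4m+1)}$ for $c > 0$, and $K_n \sim n^\gamma$ with $\gamma > (2m-1)/(4m+1)$, we have
$\sqrt{n/\beta}\,\alpha = c^{2m+1/2}$ and $\sqrt{n/\beta}\sqrt{\log K_n/(nK_n)}\,\beta^m \to 0$. The theorem follows. □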

Remark 4.2. The asymptotic results in Corollary 4.1 provide theoretical justification of the
observation that the number of knots is not important, as long as it is above some minimal
level (Ruppert, 2002). It is easy to find that the mean squared error of the P-spline estimator
is of order $n^{-4m/(4m+1)}$, which achieves the optimal rate of convergence given in Stone (1982).
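For concreteness, the exponents appearing in Corollary 4.1 and Remark 4.2 can be tabulated exactly. The helper below is a small illustrative sketch (its name and return format are not from the paper) that returns the exponent of $n$ for $\alpha$, the lower bound on $\gamma$ in $K_n \sim n^\gamma$, and the exponent of the mean squared error.

```python
from fractions import Fraction


def pspline_rates(m):
    """Exponents of n for an m-th order difference penalty, as exact fractions."""
    return {
        # alpha is of order n^(-2m/(4m+1))
        "alpha_exponent": Fraction(-2 * m, 4 * m + 1),
        # K_n ~ n^gamma requires gamma > (2m-1)/(4m+1)
        "knot_exponent_lower_bound": Fraction(2 * m - 1, 4 * m + 1),
        # mean squared error is of order n^(-4m/(4m+1))
        "mse_exponent": Fraction(-4 * m, 4 * m + 1),
    }


rates_by_order = {m: pspline_rates(m) for m in (1, 2, 3, 4)}
```

For $m = 2$ this gives $\alpha \sim n^{-4/9}$, $\gamma > 1/3$, and a mean squared error of order $n^{-8/9}$; none of the exponents involves the spline degree $p$.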

In the following, we study the asymptotic property of $\hat f^{[p]}(t) = \sum_{k=1}^{K_n+p}\hat b_kB_k^{[p]}(t)$ when
$p \ne m$. We first define a piecewise $m$th degree polynomial $\hat f^{[m]}$, where $\hat f^{[p]}$ and $\hat f^{[m]}$ share
the same set of spline coefficients. In particular, define $\hat f^{[m]}(t) = \sum_{k=1}^{K_n+m}\hat b_kB_k^{[m]}(t)$ if $p > m$,
or $\hat f^{[m]}(t) = \sum_{k=1}^{K_n+p}\hat b_kB_k^{[m]}(t)$ if $p < m$. Note that, if $p < m$, $\hat f^{[m]}$ is defined on $[0, 1 - (m-p)/K_n]$.
Following a similar discussion as above, we can establish the asymptotic distribution for
$\hat f^{[m]}$ as in (34) and (35), respectively, under different admissible ranges of $\alpha$ and $K_n$.

Lemma 4.2. For any $t \in (0,1)$, let $d = \lfloor K_nt\rfloor + 1$. Let $\gamma(t) = \hat f^{[p]}(t) - \hat f^{[m]}(t)$. Then, if
$p > m$,



q=m+l i=d+l 

and if p < m, 

m d+m 



III w 

w) = - E E (^(<-^))^ 1 koE^i-^|r/ [mI (*). ( 3? ) 

g=p+li=d+l ^ Z=l 

where the coefficients {ay} and are constants. 
Proof. The B-spline basis functions have the recurrence relationship such that 



Bf{t) = ^(t - n^Bf-^t) + ^ - t)Bf-\t). 

Let f [p ' 1] {t) = Ef=i +P_1 hB [ ^ 1] {t) with the same first (K n + p - 1) coefficients of /M. For 
x S Kd+i), the difference between f^ p \t) and fl p ~ l \t) is given by 

z — «■ L p p ' J 



E (6m-^)(^(t-^- P ))^ 1] W- 



(38) 



i=ci+l 



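From (38), if $p > m$,

$$\gamma(t) = \sum_{q=m+1}^{p} \sum_{i=d+1}^{d+q} (\hat b_{i+1} - \hat b_i)\Bigl(\frac{K_n}{q}\,(t - \kappa_{i-q})\Bigr) B_i^{[q-1]}(t).$$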



From ([2]), we have A'fr fc = (A6fc_j+i, A6fc_j +2 , ... , A6 fc ), where 



Z - 1 
1 



Z - 1 
Z - 1 



Combining this with (|7|), it is easy to show that there exists S M. pxp such that 

T 



Ab d+2 , Ab d+2 , ... , Abd+p+i 
Hence, we can write 



Aft 



E«H^^/ [P] W, fc = 2,...,p+l, 



(39) 



which gives (36). Equation (37) can be established similarly. Thus the lemma follows. □
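The degree-reduction step used in the proof can be checked numerically. The sketch below is a minimal illustration under an assumed indexing convention that is not taken from the paper: knots are $\kappa_j = j/K$ and $B_k^{[p]}$ is supported on $[\kappa_{k-p-1}, \kappa_k)$. The helper names `bspline`, `spline_value` and `degree_gap` are hypothetical. It compares $f^{[p]}(t) - f^{[p-1]}(t)$ with the sum of first differences of the coefficients weighted by $\frac{K}{p}(t - \kappa_{k-p}) B_k^{[p-1]}(t)$.

```python
def bspline(k, p, t, K):
    """Value of B_k^{[p]}(t) on equally spaced knots kappa_j = j / K.

    Assumed indexing: B_k^{[p]} is supported on [kappa_{k-p-1}, kappa_k).
    """
    if p == 0:
        return 1.0 if (k - 1) <= t * K < k else 0.0
    left = (K / p) * (t - (k - p - 1) / K) * bspline(k - 1, p - 1, t, K)
    right = (K / p) * (k / K - t) * bspline(k, p - 1, t, K)
    return left + right


def spline_value(b, p, t, K):
    """Evaluate sum_{k=1}^{K+p} b_k B_k^{[p]}(t) using the first K+p entries of b."""
    return sum(b[k - 1] * bspline(k, p, t, K) for k in range(1, K + p + 1))


def degree_gap(b, p, t, K):
    """Sum over k of (b_{k+1} - b_k) * (K/p) * (t - kappa_{k-p}) * B_k^{[p-1]}(t)."""
    return sum(
        (b[k] - b[k - 1]) * (K / p) * (t - (k - p) / K) * bspline(k, p - 1, t, K)
        for k in range(1, K + p)
    )


K, p = 5, 3
b = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1, 0.4, -0.9]  # K + p = 8 arbitrary coefficients
for t in (0.05, 0.37, 0.62, 0.93):
    lhs = spline_value(b, p, t, K) - spline_value(b, p - 1, t, K)
    rhs = degree_gap(b, p, t, K)
    print(f"t={t}: f[p]-f[p-1]={lhs:.6f}, difference formula={rhs:.6f}")
```

The two printed columns should coincide up to rounding for any $t \in [0, 1)$ that is not a knot; the recursion is exponential in $p$ and is meant for checking small cases only.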



Corollary 4.2. Suppose that $f$ is $2m$th order continuously differentiable with bounded $2m$th
derivative on $[0, 1]$. Let $\alpha$ satisfy $\alpha\, n^{2m/(4m+1)} \to 0$ and $\alpha^{-(2m-1)/(2m)} \log K_n / K_n \to 0$. Then,
for $t \in (0, 1)$,



[/b]( t )_/(t)_^( t )] ^iv(o,o-|(t)), 



(40) 



where $\gamma(t)$ is given by (36) if $p > m$ or (37) if $p < m$. However, if $\alpha = c^{2m} n^{-2m/(4m+1)}$ for $c > 0$
and $K_n \sim n^{\gamma}$ with $\gamma > (2m-1)/(4m+1)$, then



n 2m/(4m+l) _ _ 7 ( t )] _>d jy f(2m) ^ 



(41) 



Remark 4.3. When $p$ is not equal to $m$, the asymptotic bias has an additional term $\gamma(t)$,
which is of order $O_p(1/K_n)$. When $K_n$ grows sufficiently fast with respect to $n$, this term is
asymptotically negligible.
5 The equivalent kernels near boundary

The approximation of the equivalent kernel $K(t, s)$ deteriorates when $t$ is near the boundary
points of the design set. In this section, we derive an explicit formula for the equivalent kernel
when $t$ is close to the boundary. We discuss only the case when $t$ is close to 0; the case when
$t$ is close to 1 follows from a similar argument and is thus omitted.

Consider an even m first. It follows from the closed- form expressions (|23p and (|24p for 

F m (t) that for t G [0, 1] sufficiently small, the m-th derivative of J2k=o e"^^ 1 ™') [a£ cos(/3oj k t)+ 
sm(/3ujkt)] is of order O p {f3 m e~ 13 ). Hence, we only consider 



F(t) = F (t) + e~^ kt K cos{pu k t) + b k sm((3uj k t)] . 



(42) 



fc=0 



In what follows, we shall express the coefficients $a_k$, $b_k$ in terms of $F_0(0)$ and its derivatives.
This will eventually lead to an explicit expression for the kernel.
In view of (15), we have



1 dL^(\s-t\) 



dP 



G( s )ds = (-iy 



1 az#)( f 

d si 

m ^ 



-G(s)ds, VJ = 1, 



,2m- 1. 
(43) 



Moreover, it follows from Section I3TT1 that L(s) = (5 Pk{s) = [l 0] 



k=0 



k=0 



Pk{s) 
Qk( s ). 



where 



Pk{s) 
Qk( s ). 



and S(-) e SO(2) is given by S(-) 

S3) I 



cos(-) sin(-) 
— sin(-) cos(-) 



p\ 3 \s) 



(PA k ) 



As a result, we obtain 



k=0 



Pk{s) 
_Qk{ s ). 



In light of (17), we have
{l3A k ye-^ s S{uo k f3s) 2 



P +1 Y, (A k ){.e-^ s S{u k (3s) 



k=0 



where {A k )\ 9 denotes the first row of the j'-th power of A k given in (|17|) . This, along with 
E2), yields 



Jo fc=0 

E /' S{u> k ps) 

k=o Jo V 



G(s)ds 
G(s)ds. 
For notational simplicity, let B\ x denote the inverse of the matrix B\ x defined in (|26p and let



d k 



G(s)ds, j = 0, • • • , m — 1, 



and p = [po, Pi, • • • , p m -i] T S K' m . Therefore, it follows from the development in Section^ 
that 

a 



bo 



2 1 



■(B 1 e 1 + O p (e^))p = -Bf lP + O p (e 



Returning to F^ m \t) and using (HU), we have, for (3 —> oo, 



(44) 



£=0 



I 



1 dL( m )(\t-s\) 



G(s)ds + /? m q T (t)( -I^lp, 



<9s m 



(45) 



where q(i) = [q (t), qi(i), • • • ,qm_i(t)] G M m with q £ (t) = (A e )f,e-^ et S(uj e (3t) G M lx2 for 
£ = 0, 1, ■ • ■ , f - 1. 

To find the kernel in this case, particularly the kernel for the second term, recall 



Therefore, the second term in (45) becomes



Pk{s) 
Qk( s ). 



P m q r (t)(-Bf l )p = I W(t,s)G(s)ds 



where 



^o(s) 
vi{s) 



W(t,s) = P m q T (t) ( - B\ 



v m -i(s) 



and u 3 (s) = {~A k ){. (j3 ) , j = 0, . . . ,m - 1. 



fe=0 

Denote by p k {s) and <zj^(s) the r-th order integrals of p k (s) and q k (s) respectively, 
namely, 

P k T \s) = J ■■■ J p k dri---dr r , q k r \s) = j ■ ■ ■ j q k dn ■ ■ ■ dr r . 



r-copies 



r-copies 






In light of (fTTj) . it is easy to verify that 

(PA k 

)] 

d m p[ m \s) 



Using this and 



Pk{s), 



ds m 

m ^ 

E(-^)i.(/3 



k=0 
gm 

ds m 



r 


Pk(s) 






_Qk(s)_ 






>) 




ds m 




~d m p { ™\s) 




ds m 






L ds m J 



-^ s {(3A k )- r S{u k ps) 
Qk(s), we have 

m -| 



4 



3s r 



fc=0 



_(m) / x 

— (m) / \ 
% ( S ) 



E (-Ak){.(Pe-^ s (pA k )- m S(u k p t 



k=0 



Therefore, 



W(t,s) 



d" 



ds m 

IT 



q T (i) ( - %) r( S ; 



where r(s) = [ro(s), ri(s), • • • , r OT _i(s)] G K m , and 



k=0 



C k 

d k 



j = 0, ...,m-l. 



(46) 



Here the coefficients $c_k$, $d_k$ satisfy the linear equation (18). Finally, we obtain the equivalent
kernel for $t > 0$ sufficiently small (when $\beta \to \infty$) as



s) = L(|t -s\) + q T (t) ( - r(s) 



(47) 



Example 5.1. As an illustration, we derive the closed-form expression of the kernel near the
boundary $t = 0$ for $m = 2$ and compare it with the boundary kernel established by Silverman
(1984) for the smoothing splines. Since cq = do = — j=, 



A = 
we have 



cos(^) sin(f) 
-sin(¥) cos(f) 



cos(^) sin(|| 
-sin(^) cos(^ 



B 



ii 



1 ' 

1 y/2 



q(t) 



e ^ 



cos(^) 



r(t) 



/3 -i± 
e V2 



2^ 



• 0s 
V2 



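Hence, the equivalent kernel near the boundary $t = 0$ is

$$K_b(t, s) = L(|t - s|) + q^T(t)\,\bigl(-(B^{e}_{11})^{-1}\bigr)\, r(s)$$

$$= L(|t - s|) + \frac{\beta}{2\sqrt{2}}\, e^{-\frac{\beta}{\sqrt{2}}(t+s)} \Bigl[\cos\Bigl(\frac{\beta}{\sqrt{2}}(t-s)\Bigr) + 2\cos\Bigl(\frac{\beta}{\sqrt{2}}\,t\Bigr)\cos\Bigl(\frac{\beta}{\sqrt{2}}\,s\Bigr) - \sin\Bigl(\frac{\beta}{\sqrt{2}}(t+s)\Bigr)\Bigr]. \qquad (48)$$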
Figure 2: The non-boundary kernel (solid), the finite-sample kernel (dashed) and asymptotic
boundary kernel (dotted) for $m = 2$ and for (a) $\beta = 4$, (b) $\beta = 6$, (c) $\beta = 8$, (d) $\beta = 10$. The
kernels are for estimation at $x = 0.2$.
When $t = 0$, since $L(|t - s|) = \frac{\beta}{2\sqrt{2}}\, e^{-\frac{\beta}{\sqrt{2}} s} \bigl[\cos\bigl(\frac{\beta}{\sqrt{2}} s\bigr) + \sin\bigl(\frac{\beta}{\sqrt{2}} s\bigr)\bigr]$, the boundary kernel becomes
$\sqrt{2}\,\beta\, e^{-\frac{\beta}{\sqrt{2}} s} \cos\bigl(\frac{\beta}{\sqrt{2}} s\bigr)$, $s \in [0, 1]$. It is interesting to notice that the boundary kernel in (48) agrees
with that obtained by Silverman (1984). Figure 2 displays the non-boundary kernel, boundary
kernel, and the finite sample kernel when we estimate at $x = 0.2$ with different choices of $\beta$,
where the finite sample kernel is obtained by incorporating the terms containing e - ' 3 ^ 1- ') 
ignored in (|42p . Indeed, this kernel is given by 



cos(^L(t-s)) +2cos(^t) cos(^s) - S in(-^(t + s)) 
^-^{ cos + I ) [oofl(^l) - sin(^))] + y2cos(^i) cos(^))}. 



There is good agreement between the finite-sample and asymptotic kernels when $\beta = 6$
and excellent agreement when $\beta = 10$.
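The closed-form boundary kernel for $m = 2$ can be evaluated directly. The following minimal sketch codes the interior kernel $L$ and the asymptotic boundary kernel of (48), then checks numerically that at $t = 0$ the kernel reduces to $\sqrt{2}\,\beta\, e^{-\beta s/\sqrt{2}} \cos(\beta s/\sqrt{2})$. The function names and the trapezoidal integrator are assumptions made for illustration; the finite-sample kernel is not reproduced here.

```python
import math


def interior_kernel(u, beta):
    """L(|u|) for m = 2, with a = beta * |u| / sqrt(2)."""
    a = beta * abs(u) / math.sqrt(2)
    return beta / (2 * math.sqrt(2)) * math.exp(-a) * (math.cos(a) + math.sin(a))


def boundary_kernel(t, s, beta):
    """Asymptotic kernel near t = 0 for m = 2: L(|t - s|) plus the boundary correction."""
    w = beta / math.sqrt(2)
    correction = math.exp(-w * (t + s)) * (
        math.cos(w * (t - s))
        + 2 * math.cos(w * t) * math.cos(w * s)
        - math.sin(w * (t + s))
    )
    return interior_kernel(t - s, beta) + beta / (2 * math.sqrt(2)) * correction


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h


beta = 10.0
w = beta / math.sqrt(2)
for s in (0.0, 0.1, 0.3):
    closed_form = math.sqrt(2) * beta * math.exp(-w * s) * math.cos(w * s)
    print(s, boundary_kernel(0.0, s, beta), closed_form)

# Mass of the kernel over s in [0, 1] when estimating at t = 0.2.
print(integrate(lambda s: boundary_kernel(0.2, s, beta), 0.0, 1.0))
```

On the half line $s \ge 0$ the boundary kernel integrates to one for every $t \ge 0$, so the mass over $[0, 1]$ falls short of one only by a tail that is exponentially small in $\beta$.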

The development of the boundary kernel for an odd m is similar and we omit the details 
here. For notational simplicity, let denote the inverse of the matrix B°± defined in (|30p . 
We obtain the equivalent kernel for $t > 0$ sufficiently small (when $\beta \to \infty$) as



K b (t, s) = P(\t -s\) + q T (t) - B^)r(s 



(49) 



where q(t) = [(-l) m e ~' 3t , qi (t), . . . , q™^ (t)] G M m and r(s) = [r (s), n(s), . . . , r m _i(s)] T G 
R m , and 



q«(t) 



)lx2 



1 



m— 1 
2 



r .( s ) = ^ ircof3e -Ps + Y^{-A k )^ m - 3) f3e-^ s S{u: k f3s) 



k=l 



j = 0,...,m- 1, 



where the coefficients , dfc satisfy the linear equation (|2~Tj) . 



6 Extensions to unequally spaced data and multivariate smoothing

We have so far focused on the case of an equally spaced design and equally spaced knots. When
the design is not equally spaced, one can use the ideas of Stute (1984) and Li and Ruppert
(2008). Specifically, assume that the $x_i$'s are in $(a, b)$. Find a smooth monotone function $\Upsilon$
from $(a, b)$ to $(0, 1)$ such that $\Upsilon(x_i) = i/n$. We use P-spline smoothing to fit $(i/n, y_i)$,
so that the function being estimated is $f \circ \Upsilon^{-1}$. We place knots at sample quantiles so
that there are equal numbers of data points between consecutive knots.
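The transformation to an equally spaced design can be sketched as follows. This is a minimal illustration, assuming distinct design points and a sample size divisible by the number of knot intervals; the function names `rank_transform` and `quantile_knots` are hypothetical. The rank transform plays the role of the monotone map evaluated at the design points, and the quantile knots become equally spaced on the transformed scale.

```python
def rank_transform(x):
    """Map each design point to i/n, where i is its 1-based rank among the x values.

    Assumes the x values are distinct.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / n
    return u


def quantile_knots(x, num_intervals):
    """Interior knots at sample quantiles, so each interval holds equally many points.

    Assumes len(x) is divisible by num_intervals, purely to keep the sketch short.
    """
    xs = sorted(x)
    n = len(xs)
    if n % num_intervals != 0:
        raise ValueError("len(x) must be divisible by num_intervals in this sketch")
    step = n // num_intervals
    return [xs[j * step - 1] for j in range(1, num_intervals)]


x = [0.9, 0.1, 0.15, 0.4, 0.2, 0.95, 0.7, 0.12]
print(rank_transform(x))
print(quantile_knots(x, 4))
```

With eight points and four intervals, each knot is the largest design point of its block of two observations, so the knots sit at $2/8$, $4/8$ and $6/8$ after the rank transform.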

The univariate P-splines can be naturally extended to multivariate P-splines (Marx and
Eilers, 2005). The asymptotic properties can be studied along the same lines. Consider
the problem of estimating the $\nu$-dimensional function $f(t_1, \dots, t_\nu)$ from noisy observations
$y_i = f(t_{1i}, \dots, t_{\nu i}) + \epsilon_i$, $i = 1, \dots, n$. The P-spline model approximates $f$ by

$$\hat f(t_1, \dots, t_\nu) = \sum_{k_1=1}^{K_{1n}+p_1} \cdots \sum_{k_\nu=1}^{K_{\nu n}+p_\nu} b_{k_1,\dots,k_\nu}\, B_{k_1}^{[p_1]}(t_1) \cdots B_{k_\nu}^{[p_\nu]}(t_\nu).$$
The spline coefficients $b$, subject to the difference penalty, are chosen to minimize

$$\sum_{i=1}^{n} \Bigl[y_i - \sum_{k_1=1}^{K_{1n}+p_1} \cdots \sum_{k_\nu=1}^{K_{\nu n}+p_\nu} b_{k_1,\dots,k_\nu}\, B_{k_1}^{[p_1]}(t_{1i}) \cdots B_{k_\nu}^{[p_\nu]}(t_{\nu i})\Bigr]^2 + \lambda^* \sum_{k_1=m_1+1}^{K_{1n}+p_1} \cdots \sum_{k_\nu=m_\nu+1}^{K_{\nu n}+p_\nu} \bigl[\Delta^{m_1,\dots,m_\nu} b_{k_1,\dots,k_\nu}\bigr]^2,$$

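where the difference operator for the $\nu$-dimensional case is defined as follows:

$$\Delta^{0,\dots,0}\, b_{k_1,\dots,k_\nu} = b_{k_1,\dots,k_\nu}, \quad k_1 = 1, \dots, K_{1n}+p_1,\ \dots,\ k_\nu = 1, \dots, K_{\nu n}+p_\nu,$$

$$\Delta^{m_1,m_2,\dots,m_\nu}\, b_{k_1,k_2,\dots,k_\nu} = \Delta^{m_1-1,m_2,\dots,m_\nu}\, b_{k_1,k_2,\dots,k_\nu} - \Delta^{m_1-1,m_2,\dots,m_\nu}\, b_{k_1-1,k_2,\dots,k_\nu}$$

$$= \Delta^{m_1,m_2-1,\dots,m_\nu}\, b_{k_1,k_2,\dots,k_\nu} - \Delta^{m_1,m_2-1,\dots,m_\nu}\, b_{k_1,k_2-1,\dots,k_\nu}$$

$$\vdots$$

$$= \Delta^{m_1,\dots,m_\nu-1}\, b_{k_1,\dots,k_\nu} - \Delta^{m_1,\dots,m_\nu-1}\, b_{k_1,\dots,k_\nu-1}.$$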

For example, consider a two-dimensional difference operator with $m_1 = 1$ and $m_2 = 2$:

$$\Delta^{1,2}\, b_{k,s} = \Delta^{0,2}\, b_{k,s} - \Delta^{0,2}\, b_{k-1,s}$$

$$= [b_{k,s} - 2 b_{k,s-1} + b_{k,s-2}] - [b_{k-1,s} - 2 b_{k-1,s-1} + b_{k-1,s-2}].$$
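The recursive definition of the mixed difference operator translates directly into code. The sketch below is a minimal illustration with a hypothetical function name `mixed_difference` and 0-based indices; it lowers the first positive order by one at each step, as in the definition, and compares the result for orders $(1, 2)$ with the explicit expansion of the two-dimensional example.

```python
def mixed_difference(b, orders, index):
    """Mixed difference of the given orders applied to the nested list b at index.

    orders = (m_1, ..., m_v) and index = (k_1, ..., k_v), with 0-based indices.
    The recursion lowers the first positive order by one at each step.
    """
    for axis, m in enumerate(orders):
        if m > 0:
            lower = list(orders)
            lower[axis] = m - 1
            shifted = list(index)
            shifted[axis] -= 1
            return (mixed_difference(b, lower, index)
                    - mixed_difference(b, lower, shifted))
    value = b
    for k in index:
        if k < 0:
            raise IndexError("difference reaches below the first coefficient")
        value = value[k]
    return value


# Hypothetical 3 x 4 coefficient array with integer entries.
b = [[(i + 1) ** 2 * (j + 2) ** 3 for j in range(4)] for i in range(3)]
k, s = 2, 3
explicit = ((b[k][s] - 2 * b[k][s - 1] + b[k][s - 2])
            - (b[k - 1][s] - 2 * b[k - 1][s - 1] + b[k - 1][s - 2]))
print(mixed_difference(b, (1, 2), (k, s)), explicit)
```

Because the one-dimensional difference operators act on different indices, lowering the orders in any other sequence yields the same value.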

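Let $X$ be the $n \times \{\prod_{j=1}^{\nu} (K_{jn} + p_j)\}$ matrix with $(i, j)$th entry equal to $B_{k_1}^{[p_1]}(t_{1i}) \cdots B_{k_\nu}^{[p_\nu]}(t_{\nu i})$,
where $(k_1, \dots, k_\nu)$ is the multi-index corresponding to column $j$.

Define $D$ as the $\{\prod_{j=1}^{\nu} (K_{jn} + p_j - m_j)\} \times \{\prod_{j=1}^{\nu} (K_{jn} + p_j)\}$ differencing matrix satisfying

$$Db = \begin{pmatrix} \Delta^{m_1,\dots,m_\nu}\, b_{m_1+1,\dots,m_\nu+1} \\ \vdots \\ \Delta^{m_1,\dots,m_\nu}\, b_{K_{1n}+p_1,\dots,K_{\nu n}+p_\nu} \end{pmatrix}.$$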

The optimality condition is given by

$$(X^T X + \lambda^* D^T D)\, b = X^T y. \qquad (50)$$

Note that $D = D_{m_1} \otimes D_{m_2} \otimes \cdots \otimes D_{m_\nu}$ and $D^T D = D_{m_1}^T D_{m_1} \otimes D_{m_2}^T D_{m_2} \otimes \cdots \otimes D_{m_\nu}^T D_{m_\nu}$,
where "$\otimes$" represents the Kronecker product. We may go through the same procedure as
described in this paper. The multivariate P-spline smoothing is asymptotically equivalent
to kernel smoothing and the equivalent kernel is the Green's function corresponding to the
partial differential equation (PDE):

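$$(-1)^{m_1+\cdots+m_\nu}\, \alpha\, \frac{\partial^{2m_1+\cdots+2m_\nu}}{\partial t_1^{2m_1} \cdots \partial t_\nu^{2m_\nu}}\, F(t_1, \dots, t_\nu) + F(t_1, \dots, t_\nu) = G(t_1, \dots, t_\nu), \qquad (51)$$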

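subject to the boundary conditions:

$$\frac{\partial^{k_1+\cdots+k_\nu}}{\partial t_1^{k_1} \cdots \partial t_\nu^{k_\nu}}\, F(t_1, \dots, t_\nu) = 0, \quad \text{if any } t_j = 0,\ k_j = 0, \dots, m_j - 1,$$

$$\frac{\partial^{k_1+\cdots+k_\nu}}{\partial t_1^{k_1} \cdots \partial t_\nu^{k_\nu}}\, F(t_1, \dots, t_\nu) = \frac{\partial^{k_1+\cdots+k_\nu}}{\partial t_1^{k_1} \cdots \partial t_\nu^{k_\nu}}\, G(t_1, \dots, t_\nu), \quad \text{if any } t_j = 1,\ k_j = 0, \dots, m_j - 1.$$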



Further study of this issue is beyond the scope of this paper and shall be reported in a future 
publication. 
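The Kronecker structure of the multivariate differencing matrix noted in this section can be verified on a small case. The sketch below is a minimal illustration in two dimensions, with hypothetical helper names and with the sizes 4 and 5 standing in for $K_{jn} + p_j$; it builds one-dimensional difference matrices, forms their Kronecker product, and checks that the penalty matrix factorizes as the Kronecker product of the one-dimensional penalty matrices.

```python
def diff_matrix(n, m):
    """(n - m) x n matrix of the m-th order difference operator."""
    rows = [[1 if j == i else 0 for j in range(n)] for i in range(n)]
    for _ in range(m):
        # Each pass replaces consecutive rows by their difference.
        rows = [[hi - lo for lo, hi in zip(rows[i], rows[i + 1])]
                for i in range(len(rows) - 1)]
    return rows


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


D1 = diff_matrix(4, 1)  # first-order differences on 4 coefficients
D2 = diff_matrix(5, 2)  # second-order differences on 5 coefficients
D = kron(D1, D2)
lhs = matmul(transpose(D), D)
rhs = kron(matmul(transpose(D1), D1), matmul(transpose(D2), D2))
print(len(D), len(D[0]), lhs == rhs)
```

In practice this factorization means the penalty matrix never needs to be formed from the full $D$; only the small one-dimensional products are required.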